RNA-Seq Accurately Identifies Cancer Biomarker Signatures to Distinguish Tissue of Origin

Authors

  • Iris H. Wei
  • Yang Shi
  • Hui Jiang
  • Chandan Kumar-Sinha
  • Arul M. Chinnaiyan
Abstract

Metastatic cancer of unknown primary (CUP) accounts for up to 5% of all new cancer cases, with a 5-year survival rate of only 10%. Accurate identification of tissue of origin would allow for directed, personalized therapies to improve clinical outcomes. Our objective was to use transcriptome sequencing (RNA-Seq) to identify lineage-specific biomarker signatures for the cancer types that most commonly metastasize as CUP (colorectum, kidney, liver, lung, ovary, pancreas, prostate, and stomach). RNA-Seq data of 17,471 transcripts from a total of 3,244 cancer samples across 26 different tissue types were compiled from in-house sequencing data and publicly available International Cancer Genome Consortium and The Cancer Genome Atlas datasets. Robust cancer biomarker signatures were extracted using a 10-fold cross-validation method of log transformation, quantile normalization, transcript ranking by area under the receiver operating characteristic curve, and stepwise logistic regression. The entire algorithm was then repeated with a new set of randomly generated training and test sets, yielding highly concordant biomarker signatures. External validation of the cancer-specific signatures yielded high sensitivity (92.0% ± 3.15%; mean ± standard deviation) and specificity (97.7% ± 2.99%) for each cancer biomarker signature. The overall performance of this RNA-Seq biomarker-generating algorithm yielded an accuracy of 90.5%. In conclusion, we demonstrate a computational model for producing highly sensitive and specific cancer biomarker signatures from RNA-Seq data, generating signatures for the top eight cancer types responsible for CUP to accurately identify tumor origin.
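The preprocessing and ranking steps named in the abstract (log transformation, quantile normalization, transcript ranking by area under the ROC curve) and the reported sensitivity and specificity can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the log2(x + 1) form of the transform, the tie handling, and the toy data are all assumptions made for illustration, and the stepwise logistic regression and 10-fold cross-validation stages are omitted.

```python
import math

def log_transform(matrix):
    """Apply log2(x + 1) to every value (rows = samples, columns = transcripts).
    The +1 pseudocount is an assumption, not taken from the abstract."""
    return [[math.log2(v + 1.0) for v in row] for row in matrix]

def quantile_normalize(matrix):
    """Give every sample the same value distribution: the mean of the sorted rows.
    Ties within a row are broken by column order (a simplification)."""
    n_samples, n_feats = len(matrix), len(matrix[0])
    sorted_rows = [sorted(row) for row in matrix]
    reference = [sum(r[k] for r in sorted_rows) / n_samples for k in range(n_feats)]
    normalized = []
    for row in matrix:
        order = sorted(range(n_feats), key=lambda j: row[j])
        new_row = [0.0] * n_feats
        for rank, j in enumerate(order):
            new_row[j] = reference[rank]
        normalized.append(new_row)
    return normalized

def auc(scores, labels):
    """Area under the ROC curve as the probability that a positive sample
    scores higher than a negative one (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def rank_transcripts(matrix, labels):
    """Return transcript indices ordered by descending AUC, plus the AUC values."""
    n_feats = len(matrix[0])
    aucs = [auc([row[j] for row in matrix], labels) for j in range(n_feats)]
    ranking = sorted(range(n_feats), key=lambda j: aucs[j], reverse=True)
    return ranking, aucs

def sensitivity_specificity(predicted, actual):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 1)
    fn = sum(1 for p, a in zip(predicted, actual) if p == 0 and a == 1)
    tn = sum(1 for p, a in zip(predicted, actual) if p == 0 and a == 0)
    fp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 0)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical toy data: 4 samples x 3 transcripts; label 1 = target cancer type.
counts = [
    [100, 5, 20],
    [80, 7, 22],
    [3, 6, 19],
    [1, 9, 25],
]
labels = [1, 1, 0, 0]

normalized = quantile_normalize(log_transform(counts))
ranking, aucs = rank_transcripts(normalized, labels)
print("transcripts ranked by AUC:", ranking)
print("per-transcript AUC:", aucs)
```

In a full pipeline along the lines the abstract describes, the top-ranked transcripts would then feed a stepwise logistic regression fitted inside each cross-validation fold, with normalization and ranking recomputed on the training portion only to avoid leaking test information.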

Related resources

Toward reliable biomarker signatures in the age of liquid biopsies - how to standardize the small RNA-Seq workflow

Small RNA-Seq has emerged as a powerful tool in transcriptomics, gene expression profiling and biomarker discovery. Sequencing cell-free nucleic acids, particularly microRNA (miRNA), from liquid biopsies additionally provides exciting possibilities for molecular diagnostics, and might help establish disease-specific biomarker signatures. The complexity of the small RNA-Seq workflow, however, be...

Upregulation of HOTAIR Transcript Level in Tumor Tissue of Iranian Women with Breast Cancer

Background: Dysregulation of HOX Transcript Antisense Intergenic RNA (HOTAIR) has been linked to the etiopathogenesis of several human cancers. According to epidemiological evidence, the risk of susceptibility to breast cancer varies among different populations. This study was designed to determine the transcriptional level of HOTAIR in tumor cells of breast cancer pat...

Large-scale RNA-Seq Transcriptome Analysis of 4043 Cancers and 548 Normal Tissue Controls across 12 TCGA Cancer Types

The Cancer Genome Atlas (TCGA) has accrued RNA-Seq-based transcriptome data for more than 4000 cancer tissue samples across 12 cancer types, but translating these data into biological insights remains a major challenge. We analyzed and compared the transcriptomes of 4043 cancer and 548 normal tissue samples from 21 TCGA cancer types, and created a comprehensive catalog of gene expression alteration...

Evaluating the clinical importance of long-non coding RNA MALAT1 expression in breast cancer

Background: Breast cancer is one of the major causes of illness and mortality among women. Long non-coding RNAs (lncRNAs) have an important role in tumor development and progression. Metastasis-associated lung adenocarcinoma transcript 1 (MALAT1) is a lncRNA that is deregulated in several cancers; however, its value in the diagnosis of breast cancer is unclear. This study was conducted to investigate...

Interpreting Personal Transcriptomes: Personalized Mechanism-Scale Profiling of RNA-seq Data

Despite thousands of reported studies unveiling gene-level signatures for complex diseases, few of these techniques work at the single-sample level with explicit underpinning of biological mechanisms. This presents both a critical dilemma in the field of personalized medicine as well as a plethora of opportunities for analysis of RNA-seq data. In this study, we hypothesize that the "Functional ...

Journal title:

Volume 16, Issue

Pages -

Publication date: 2014